James-Stein shrinkage to improve k-means cluster analysis

Authors

  • Jinxin Gao
  • David B. Hitchcock
Abstract

We study a general algorithm to improve accuracy in cluster analysis that employs the James-Stein shrinkage effect in k-means clustering. We shrink the centroids of clusters toward the overall mean of all data using a James-Stein-type adjustment, and then the James-Stein shrinkage estimators act as the new centroids in the next clustering iteration until convergence. We compare the shrinkage results to the traditional k-means method. Monte Carlo simulation shows that the magnitude of the improvement depends on the within-cluster variance and especially on the effective dimension of the covariance matrix. Using the Rand index, we demonstrate that accuracy increases significantly in simulated data and in a real data example.
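The iterative scheme the abstract describes — run a k-means assignment step, then shrink each cluster centroid toward the grand mean with a James-Stein-type factor before the next iteration — can be sketched roughly as follows. This is a minimal illustration under assumptions of our own (a known noise variance `sigma2`, a positive-part shrinkage factor, shrinkage applied per centroid), not the authors' exact algorithm:

```python
import numpy as np

def js_kmeans(X, k, n_iter=50, sigma2=1.0, rng=None):
    """k-means whose centroid updates are shrunk toward the overall mean
    with a James-Stein-type factor (illustrative sketch only)."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    grand_mean = X.mean(axis=0)
    # Initialize centroids at k distinct data points.
    centroids = X[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Assignment step: nearest centroid in Euclidean distance.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: cluster mean, then James-Stein shrinkage toward
        # the grand mean (positive-part factor; p > 2 required).
        for j in range(k):
            pts = X[labels == j]
            if len(pts) == 0:
                continue  # keep the old centroid for an empty cluster
            dev = pts.mean(axis=0) - grand_mean
            norm2 = dev @ dev
            if p > 2 and norm2 > 0:
                factor = max(0.0, 1.0 - (p - 2) * sigma2 / (len(pts) * norm2))
            else:
                factor = 1.0
            centroids[j] = grand_mean + factor * dev
    return labels, centroids
```

The shrinkage factor pulls centroids of small or noisy clusters more strongly toward the grand mean, which is the mechanism the simulation study credits for the accuracy gain.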


Related articles

Applications of James–Stein Shrinkage (II): Bias Reduction in Instrumental Variable Estimation

In a two-stage linear regression model with Normal noise, I consider James–Stein type shrinkage in the estimation of the first-stage instrumental variable coefficients. For at least four instrumental variables and a single endogenous regressor, I show that the standard two-stage least-squares estimator is dominated with respect to bias. I construct the dominating estimator by a variant of James...


Applications of James–Stein Shrinkage (I): Variance Reduction without Bias

In a linear regression model with homoscedastic Normal noise, I consider James–Stein type shrinkage in the estimation of nuisance parameters associated with control variables. For at least three control variables and exogenous treatment, I show that the standard least-squares estimator is dominated with respect to squared-error loss in the treatment effect even among unbiased estimators and even...


James-Stein type estimator by shrinkage to closed convex set with smooth boundary

We give James-Stein type estimators of a multivariate normal mean vector by shrinkage to a closed convex set K with smooth or piecewise smooth boundary. The rate of shrinkage is determined by the curvature of the boundary of K at the projection point onto K. By considering a sequence of polytopes K_j converging to K, we show that a particular estimator we propose is the limit of a sequence of estimat...


James-Stein Type Center Pixel Weights for Non-Local Means Image Denoising

Non-Local Means (NLM) and its variants have proven to be effective and robust in many image denoising tasks. In this letter, we study approaches to selecting center pixel weights (CPW) in NLM. Our key contributions are: 1) we give a novel formulation of the CPW problem from a statistical shrinkage perspective; 2) we construct the James-Stein shrinkage estimator in the CPW context; and 3) we pro...


Cluster-Seeking James-Stein Estimators

This paper considers the problem of estimating a high-dimensional vector of parameters θ ∈ R^n from a noisy observation. The noise vector is i.i.d. Gaussian with known variance. For a squared-error loss function, the James-Stein (JS) estimator is known to dominate the simple maximum-likelihood (ML) estimator when the dimension n exceeds two. The JS-estimator shrinks the observed vector towards th...
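The classic estimator this abstract refers to can be written in a few lines. The sketch below uses the textbook positive-part form shrinking toward the origin with unit noise variance; the paper itself studies shrinkage toward cluster-like attractors, which this does not reproduce:

```python
import numpy as np

def james_stein(y, sigma2=1.0):
    """Positive-part James-Stein estimate of a normal mean from a single
    observation y; dominates the ML estimator (y itself) when len(y) > 2."""
    n = y.shape[0]
    # Shrinkage factor: 1 - (n - 2) * sigma^2 / ||y||^2, floored at zero.
    shrink = 1.0 - (n - 2) * sigma2 / np.dot(y, y)
    return max(shrink, 0.0) * y

# Example: y = (3, 4, 0, 0), ||y||^2 = 25, factor = 1 - 2/25 = 0.92.
est = james_stein(np.array([3.0, 4.0, 0.0, 0.0]))
```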



Journal:
  • Computational Statistics & Data Analysis

Volume 54, Issue –

Pages –

Published: 2010